Parallel Fast Linear Space Alignment
Abstract
We introduce a new parallel algorithm for pairwise sequence alignment, called Parallel FastLSA, that finds optimal answers, uses linear space with respect to problem size and number of processors, and achieves good speedups in practice. Computational biologists use sequence alignment algorithms to compare sequences of DNA (or proteins). Since the sequences can be long, space efficiency and turnaround time can be important, especially for interactive use. We have experimented with Parallel FastLSA on an SGI Origin 2400 shared-memory parallel computer using a variety of mammalian DNA sequences from external research groups. The algorithm achieves good speedups on problems of sufficient granularity. We demonstrate speedups of between 3.73 (small problem) and 7.50 (large problem) on 8 processors, and up to 11.28 on 16 processors.
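To put the reported speedups in context, the short sketch below converts them to parallel efficiency (speedup divided by processor count). The figures are the ones quoted in the abstract; the function name, the labels, and the choice of efficiency as a metric are illustrative assumptions, not part of the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup / processors; 1.0 would be ideal linear speedup."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# (label, speedup, processors) as quoted in the abstract.
# The labels are hypothetical names chosen for this sketch.
REPORTED = [
    ("small problem, 8 processors", 3.73, 8),
    ("large problem, 8 processors", 7.50, 8),
    ("best reported, 16 processors", 11.28, 16),
]

EFFICIENCIES = {
    label: parallel_efficiency(speedup, procs)
    for label, speedup, procs in REPORTED
}

if __name__ == "__main__":
    for label, eff in EFFICIENCIES.items():
        print(f"{label}: {eff:.1%}")
```

Read this way, the large problem on 8 processors runs at roughly 94% efficiency and the small problem at under 50%, which is consistent with the abstract's remark that good speedups require problems of sufficient granularity.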
Similar resources
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
DACIDR: Deterministic Annealed Clustering with Interpolative Dimension Reduction using Large Collection of 16S rRNA Sequences
The development of next-generation sequencing technology has made it possible to generate millions of sequences from environmental samples. However, the difficulty associated with taxonomy-independent analysis increases as the sequence size expands. Most of the existing algorithms, which aim to generate operational taxonomic units (OTUs), require quadratic space and time complexity that makes t...
Parallel Sequence Alignment in Limited Space
Sequence comparison with affine gap costs is a problem that is readily parallelizable on simple single-instruction, multiple-data stream (SIMD) parallel processors using only constant space per processing element. Unfortunately, the twin problem of sequence alignment, finding the optimal character-by-character correspondence between two sequences, is more complicated. While the innovative O(n^2)...
SSW Library: An SIMD Smith-Waterman C/C++ Library for Use in Genomic Applications
BACKGROUND: The Smith-Waterman algorithm, which produces the optimal pairwise alignment between two sequences, is frequently used as a key component of fast heuristic read mapping and variation detection tools for next-generation sequencing data. Though various fast Smith-Waterman implementations have been developed, they are either designed as monolithic protein database searching tools, which do not...
Reduced space sequence alignment
MOTIVATION: Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n^2) time and O(n^2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n...